Characteristic Patch Sizes in DNA Sequences
Authors
Abstract
We develop techniques for studying characteristic length scales in DNA sequences and apply them to the analysis of long genomic sequences, including all available human sequences longer than 100,000 bp and the nine sequenced yeast chromosomes. We find evidence suggesting the existence of a hierarchy of characteristic length scales in all the genomic DNA sequences analyzed. In particular, we find similar patch sizes in all nine yeast chromosomes, and some patch sizes occur in several organisms. We examine the possibility that in yeast the patchiness is caused by the alternation of coding and noncoding DNA sequences. We also examine whether in human sequences the patchiness is related to repetitive sequences. However, we conclude that neither repetitive sequences nor the alternation of coding and noncoding DNA can fully explain the mosaic structure of DNA.
Similar resources
Quantification of DNA patchiness using long-range correlation measures.
We introduce and develop new techniques to quantify DNA patchiness, and to quantify characteristics of its mosaic structure. These techniques, which involve calculating two functions, alpha(l) and beta(l), measure correlations at length scale l and detect distinct characteristic patch sizes embedded in scale-invariant patch size distributions. Using these new methods, we address a number of iss...
Comparison of rangeland structure and the degree of landscape connectivity degradation in the Iril sub-watersheds, Ardabil Province
The aim of this research was to compare rangeland structure and landscape degradation in the Iril sub-watersheds, Ardabil Province. A land use map and the Landscape Fragmentation Tool were used to determine four classes (patch, edge, perforated, and core) at cell sizes of 5, 10, 15, and 20 m. The landscape metrics LPI, ED, cohesion, mesh, split, and AI were calculated using Fragstats software. R...
Construction of vaccine from Lactococcus lactis bacteria using Aeromonas hydrophila virulent Aerolysin gene
In this study, forward and reverse primers were designed to amplify the segments (~250 bp and ~650 bp) of the gene coding for domains 1 and 4 of the aerolysin of Aeromonas hydrophila. These two domains are involved in the pathogenesis associated with the aerolysin gene. Sequences for two restriction enzymes, Pst I and Hind III, were included in the forward and reverse primers, respectively. These restriction enz...
Compositional segmentation and long-range fractal correlations in DNA sequences.
A segmentation algorithm based on the Jensen-Shannon entropic divergence is used to decompose long-range correlated DNA sequences into statistically significant, compositionally homogeneous patches. By adequately setting the significance level for segmenting the sequence, the underlying power-law distribution of patch lengths can be revealed. Some of the identified DNA domains were uncorrelated,...
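As a rough illustration of the idea behind Jensen-Shannon segmentation, the sketch below finds the single cut point that maximizes the divergence between the nucleotide compositions of the left and right parts of a sequence. It is a minimal sketch under stated assumptions, not the cited authors' implementation: the function names `shannon_entropy` and `best_split` are hypothetical, and the significance test and recursive splitting that the abstract mentions are omitted.

```python
from collections import Counter
from math import log2


def shannon_entropy(counts, total):
    """Shannon entropy (in bits) of a symbol-count table."""
    return -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)


def best_split(seq):
    """Return (position, divergence) of the cut maximizing the
    Jensen-Shannon divergence between seq[:position] and seq[position:].

    Hypothetical helper for illustration; no significance threshold is applied.
    """
    n = len(seq)
    total = Counter(seq)
    h_all = shannon_entropy(total, n)
    left = Counter()
    best_pos, best_d = None, -1.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        right = total - left  # Counter subtraction keeps only positive counts
        d = (h_all
             - (i / n) * shannon_entropy(left, i)
             - ((n - i) / n) * shannon_entropy(right, n - i))
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d


if __name__ == "__main__":
    pos, d = best_split("A" * 10 + "G" * 10)
    print(pos, d)
```

A full segmentation would apply such a split recursively to each part and stop when the maximal divergence is no longer statistically significant; the choice of that significance level is what the abstract describes as revealing the patch-length distribution.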
Statistical analysis of large DNA sequences using distribution of DNA words
Conventional sequence alignment techniques for comparing and analysing relatively smaller DNA sequences of nearly equal sizes are not applicable to data consisting of large sequences with widely varying sizes. In this article DNA sequences have been analysed based on distributions of DNA words. DNA word frequencies are simple yet effective statistical tools to capture information about structur...
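To make the notion of DNA word frequencies concrete, the following sketch counts overlapping words of length k and normalizes the counts to relative frequencies. It is only an illustration under assumptions: the function name `word_frequencies` and the use of overlapping windows are choices made here, and the truncated abstract does not say how the cited article defines or compares its word distributions.

```python
from collections import Counter


def word_frequencies(seq, k):
    """Relative frequencies of overlapping length-k words in a DNA sequence.

    Hypothetical helper for illustration; returns an empty dict when the
    sequence is shorter than k.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}


if __name__ == "__main__":
    freqs = word_frequencies("ACGTACGT", 2)
    for word in sorted(freqs):
        print(word, round(freqs[word], 3))
```

Because the frequencies are normalized by the number of words, tables computed this way can be compared between sequences of very different lengths, which is the situation the abstract says alignment handles poorly.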